
    The geography of recent genetic ancestry across Europe

    The recent genealogical history of human populations is a complex mosaic formed by individual migration, large-scale population movements, and other demographic events. Population genomics datasets can provide a window into this recent history, as rare traces of recent shared genetic ancestry are detectable due to long segments of shared genomic material. We make use of genomic data for 2,257 Europeans (the POPRES dataset) to conduct one of the first surveys of recent genealogical ancestry over the past three thousand years at a continental scale. We detected 1.9 million shared genomic segments, and used the lengths of these to infer the distribution of shared ancestors across time and geography. We find that a pair of modern Europeans living in neighboring populations share around 10-50 genetic common ancestors from the last 1500 years, and upwards of 500 genetic ancestors from the previous 1000 years. These numbers drop off exponentially with geographic distance, but since genetic ancestry is rare, individuals from opposite ends of Europe are still expected to share millions of common genealogical ancestors over the last 1000 years. There is substantial regional variation in the number of shared genetic ancestors: especially high numbers of common ancestors between many eastern populations likely date to the Slavic and/or Hunnic expansions, while much lower levels of common ancestry in the Italian and Iberian peninsulas may indicate weaker demographic effects of Germanic expansions into these areas and/or more stably structured populations. Recent shared ancestry in modern Europeans is ubiquitous, and clearly shows the impact of both small-scale migration and large historical events. 
Population genomic datasets have considerable power to uncover recent demographic history, and will allow a much fuller picture of the close genealogical kinship of individuals across the world. Comment: Full size figures available from http://www.eve.ucdavis.edu/~plralph/research.html; or html version at http://ralphlab.usc.edu/ibd/ibd-paper/ibd-writeup.xhtm

    Semantically linking molecular entities in literature through entity relationships

    Background: Text mining tools have gained popularity for processing the vast number of research articles available in the biomedical literature. It is crucial that such tools extract information with a sufficient level of detail to be applicable in real-life scenarios. Studies of mining non-causal molecular relations contribute to this goal by formally identifying the relations between genes, promoters, complexes and various other molecular entities found in text. More importantly, these studies help to enhance the integration of text mining results with database facts. Results: We describe, compare and evaluate two frameworks developed for the prediction of non-causal or 'entity' relations (REL) between gene symbols and domain terms. For the corresponding REL challenge of the BioNLP Shared Task of 2011, these systems ranked first (57.7% F-score) and second (41.6% F-score). In this paper, we investigate the performance discrepancy of 16 percentage points by benchmarking on a related and more extensive dataset, analysing the contribution of both the term detection and relation extraction modules. We further construct a hybrid system combining the two frameworks and experiment with intersection and union combinations, achieving high-precision and high-recall results, respectively. Finally, we highlight extremely high-performance results (F-score > 90%) obtained for the specific subclass of embedded entity relations that are essential for integrating text mining predictions with database facts. Conclusions: The results from this study will enable us in the near future to annotate semantic relations between molecular entities in the entire scientific literature available through PubMed. The recent release of the EVEX dataset, containing biomolecular event predictions for millions of PubMed articles, is an interesting and exciting opportunity to overlay these entity relations with event predictions on a literature-wide scale.

    Human populations are tightly interwoven


    The Y chromosome as the most popular marker in genetic genealogy benefits interdisciplinary research

    The Y chromosome is currently by far the most popular marker in genetic genealogy, which combines genetic data and family history. This popularity is based on its haploid character and its close association with the patrilineage and the paternally inherited surname. Other markers have not (yet) been found to overturn this status, owing to the low sensitivity and precision of autosomal DNA for genetic genealogical applications, given the vagaries of recombination, and the lower capacities of mitochondrial DNA combined with a generally much lower interest in maternal lineages. The current knowledge about the Y chromosome and the availability of markers with divergent mutation rates make it possible to answer questions on relatedness levels which differ in time depth, from the individual and familial level to the surname, clan and population level. The use of the Y chromosome in genetic genealogy has led to applications in several well-established research disciplines, e.g., family history, demography, anthropology, forensic sciences, population genetics and sex chromosome evolution. The information obtained from analysing this chromosome is interesting not only for academic scientists but also for the huge and lively community of amateur genealogists and citizen-scientists, fascinated by analysing their own genealogy or surname. This popularity, however, also has some drawbacks, mainly for privacy reasons related to the DNA donor, his close family and distantly related namesakes. In this review paper we argue why Y-chromosomal analysis and its genetic genealogical applications will still play an important role in future interdisciplinary research.